(54) Automatic rejection of diffraction effects in thin film metrology


(57) A layer thickness determination system (10) is employed for detecting a thickness of at least one layer
(12a) disposed over a surface of a wafer (13) having one or more first regions characterized by circuit and
other features, and one or more second regions characterized by an absence of circuit and other features.
The determination system (10) includes an optical system (14) for collecting light reflecting from the at least
one layer (12a) and the surface of the wafer (13). The optical system (14) is preferably a telecentric optical
system, and has an optical axis (OA; 14a) and a narrow cone of acceptance angles disposed about the optical
axis (OA; 14a). The determination system (10) further includes a camera (16) coupled to the optical system
(14) for obtaining an image from the collected light; a first light source (22) for illuminating the layer (12a) with
light that is directed along the optical axis (OA; 14a) and within the cone of acceptance angles; and at least one
second light source (18) for illuminating the layer (12a) with light that is directed off the optical axis (OA; 14a)
and outside of the cone of acceptance angles. A data processor has an input coupled to an output of the camera
(16) for obtaining from the camera (16) first pixel data corresponding to at least one first image obtained
with light from the first light source (22) and for obtaining second pixel data corresponding to at least one second
image obtained with light from the at least one second light source (18). The data processor operates to generate
an image mask from the second pixel data for distinguishing the first wafer regions from the second wafer
regions, and further operates to detect a thickness of the at least one layer (12a) within the second regions in
accordance with the first pixel data and in accordance with predetermined reference pixel data (Fig. 1).
Description

FIELD OF THE INVENTION: 

This invention relates generally to optical metrology methods and apparatus and, in particular, this invention relates 
to methods and apparatus for measuring a thickness of one or more thin layers or films that are disposed upon or over 
a surface of a supporting substrate. 

BACKGROUND OF THE INVENTION: 

In commonly assigned U.S. Patents 5,291,269 (3/1/94), 5,293,214 (3/8/94) and 5,333,049 (7/26/94) the inventor
discloses methods and apparatus for determining a thickness of a layer of material. For example, in the '049 patent
the inventor discloses a full aperture measurement instrument for determining a thickness of a layer disposed on a
substrate. The substrate may be a semiconductor wafer. A light source is used for illuminating a surface of the layer
and a CCD camera is employed for obtaining an image of the illuminated surface. The image obtained from the camera
is converted into a map of measured reflectance data, which is subsequently compared to reference reflectance data.
The result is the generation of a map describing a thickness profile of the layer.

Although the techniques described in these commonly assigned U.S. Patents are very well suited for their intended
applications, a problem is created when the underlying surface of the wafer is patterned, as is typically the case when
a semiconductor wafer is being processed to form integrated circuits. In this case the illumination that passes through
the surface layer, which may be a layer of SiO2 used as a Chemical-Mechanical Polishing (CMP) layer, is scattered
and diffracted by the underlying circuit features. These features often take the form of short, repetitive linear structures
which, due to their small size and close spacings, can function as wavelength selective diffraction gratings. The
scattering and diffraction of the illumination results in a significant reduction in the amount of illumination that reaches the
camera, often by as much as 30% to 50%, over the case where the underlying substrate or wafer is smooth and not
patterned. Furthermore, the optical system for a full aperture thickness measurement system is typically incapable of
resolving the micron and sub-micron sized circuit features. In addition, the patterning of the wafer surface varies widely
over the surface, depending on what type of integrated circuit structures are being fabricated within a given area. As
a result, it is very difficult to accurately model the optical behavior of the wafer/layer system, thus severely complicating
the task of generating accurate reference reflectance data for use in comparing to an obtained image.

Presently available equipment that is used to measure film thicknesses on patterned wafers uses microscope
objectives which view only one small region on the wafer. The presently available equipment has many other
disadvantages.

One disadvantage results from the use of a microscope which provides high magnification, but at the expense of 
a small f-number and correspondingly large light collection angle. The large light collection angle allows diffracted and 
scattered radiation to enter the optical system and to interfere with the specular reflection from the film layers being 
measured. It is extremely difficult or impossible to separate these two contributions, since they both are detected by 
the same optical detector. 

A second disadvantage results from a requirement to locate the microscope objective at a precise point in the field
so as to avoid diffracting areas. This requires the use of an accurate x-y wafer positioning stage, and further requires
precise knowledge of the details of the spatial arrangement of the circuit features.

In addition, currently available systems do not utilize the benefits that accrue from image collection or image 
processing in order to enhance the measurement of thin layers or films. 

The aforementioned CMP layer is typically applied so as to planarize the surface of the wafer as it is processed.
As circuit features are incrementally formed during wafer processing the height of the surface of the wafer tends to
vary widely. This variation in height over the wafer surface complicates the subsequent accurate placement of further
circuit structures. Also, focussing becomes more difficult, resulting in lower chip yields. To overcome these problems it
is known to deposit a dielectric layer, such as a CVD deposited layer of SiO2, and to then chemically and mechanically
polish the dielectric layer (i.e., the CMP layer), thus providing a smooth and uniform electrically insulating layer upon
which to continue to form further circuit structures. In this case apertures are made through the CMP layer as required
to contact already formed circuit features. In order to accurately planarize the CMP layer it is thus required to accurately
know the thickness of the CMP layer, at a plurality of locations, so as not to remove too much of the CMP layer. If this
were to occur the destruction of the underlying circuits could result. Even if the underlying circuits are not damaged, if
the CMP layer is made too thin the dielectric characteristics of the CMP layer may be impaired, resulting in short circuits
developing between circuit features located above and below the CMP layer.

It can be appreciated that a semiconductor wafer in an intermediate stage of processing can represent a very
significant investment in both processing time and money. It can therefore further be appreciated that it is an important
requirement to accurately determine the thickness profile of the CMP layer. It is also important to accurately determine
the thickness profile of other types of intermediate layers that may be deposited on existing patterns for, by example,
quality control and diagnosis purposes.

It is thus one object of this invention to accomplish the determination of the thickness profile of one or more layers 
or films in a rapid manner so as not to unduly impact the throughput of a semiconductor fabrication line. It is a further 
object of this invention to accomplish the determination of the thickness profile of one or more layers or films without 
requiring a priori knowledge of the types of underlying scattering and diffracting features, their geometry, or their
locations (both spatial and/or angular), and to also not require the use of precise positioning tables and the like.

SUMMARY OF THE INVENTION 


The foregoing and other problems are overcome and the objects of the invention are realized by a method of
optically cataloging the surface of a patterned wafer into regions having different optical scatter properties, so that
errors in the computation of thickness or optical constant maps can be reduced. A first level of screening separates
purely planar film system regions from those containing scattering and diffracting features. Further differentiation
between different planar film designs at different places on the wafer is accomplished by using different numerical spectral
libraries, and a subsequent determination of which spectral library provides a best fit (lowest merit function) over a
given area. Typically the measurements are made with prior knowledge of the type of film systems to expect but not
necessarily where individual areas are located or how they are aligned.

Differentiating between areas which have different diffractive and scattering signatures (different circuit patterns,
line directions, etc.) may be accomplished using multiple libraries which include a coherent coupling between the film
surfaces and the diffracting and scattering structures. In this case library computation requires knowledge of the circuit
spatial details and their orientation, as well as the optical properties (e.g., index of refraction at various wavelengths)
of the materials.

The teaching of this invention enables the measurement process for planar layers to occur in a rapid manner and
does not require that the wafer be precisely positioned under the optical measurement system.

The teaching of this invention employs a high resolution, narrow field of view multispectral full aperture imaging
system which incorporates an automatic system for determining which regions on a wafer contain planar areas and
which regions contain scattering and diffracting regions. By example, two white light images, taken with oblique
illumination in orthogonal directions, are used to create a binary image mask which is used to prevent thickness computations
from being carried out in areas which contain circuit features of any type (e.g., edges, rectangles, sub-micron features,
die edge lines, etc.). This technique speeds the measurement process and significantly reduces the number of
erroneous values resulting from scattering, diffraction, and defects, since these areas are automatically avoided.

A layer thickness determination system in accordance with this invention is employed for detecting a thickness of
at least one layer disposed over a surface of a wafer having one or more first regions characterized by circuit and other
features, and one or more second regions characterized by an absence of circuit and other features. The system
includes an optical system for collecting light reflecting from the at least one layer and the surface of the wafer. The
optical system is preferably a telecentric optical system, and has an optical axis and a narrow cone of acceptance
angles disposed about the optical axis. The system further includes a camera coupled to the optical system for obtaining
an image from the collected light; a first light source, which includes filters, for illuminating the layer with substantially
monochromatic light that is directed along the optical axis and within the cone of acceptance angles; and at least one
second light source for illuminating the layer with light that is directed off the optical axis and outside of the cone of
acceptance angles. A data processor has an input coupled to an output of the camera for obtaining from the camera
first pixel data corresponding to at least one first image obtained with light from the first light source and for obtaining
second pixel data corresponding to at least one second image obtained with light from the at least one second light
source. The data processor operates to generate an image mask from the second pixel data for distinguishing the first
wafer regions from the second wafer regions, and further operates to detect a thickness of the at least one layer within
the second regions in accordance with the first pixel data and in accordance with predetermined reference pixel data.

The first multispectral light source provides illumination with a first incidence angle on the layer, the first incidence
angle being an angle that causes specularly reflected light to enter the cone of acceptance angles of the optical system.
The second light source (which may be a part of the first light source) provides illumination with a second incidence angle
on the layer, the second incidence angle being an angle that causes specularly reflected light to not enter the cone of
acceptance angles of the optical system.


BRIEF DESCRIPTION OF THE DRAWINGS 

The above set forth and other features of the invention are made more apparent in the ensuing Detailed Description 
of the Invention when read in conjunction with the attached Drawings, wherein: 

Fig. 1 is a simplified block diagram illustrating a film thickness measurement system in accordance with this
invention;

Fig. 2 illustrates process steps in accordance with a prior art method of measuring film thickness for a film disposed
on a planar, unpatterned substrate;

Fig. 3 illustrates process steps in accordance with this invention for measuring film thickness for a film disposed
on a patterned substrate;

Figs. 4a-4e depict images of a patterned substrate having a film layer, and illustrate a method for deriving a mask
to select areas that avoid scattering and diffracting patterns;

Fig. 4f is an enlarged cross-sectional view, not to scale, that illustrates the use of a planar reference surface within
the field of view of the camera of Fig. 1;

Fig. 4g is an enlarged cross-sectional view that illustrates a portion of a wafer and two SiO2 films;

Fig. 5a illustrates a thickness map for a portion of a SiO2 layer on a patterned substrate without the use of the
mask generated in accordance with this invention;

Fig. 5b illustrates a thickness map for the portion of the SiO2 layer on the patterned substrate with the use of the
mask generated in accordance with this invention;

Fig. 6a is a histogram corresponding to the unmasked thickness map of Fig. 5a;

Fig. 6b is a histogram corresponding to the masked thickness map of Fig. 5b;

Fig. 7a is an exemplary 2.06 micron layer thickness map obtained by masking and thresholding in accordance
with this invention;

Fig. 7b is an exemplary 1.33 micron layer thickness map obtained by masking and thresholding in accordance
with this invention;

Fig. 8a is a merit function map corresponding to the unmasked thickness map of Fig. 5a;

Fig. 8b is a merit function map corresponding to the masked thickness map of Fig. 5b;

Fig. 9 is a logic flow diagram illustrating the use of a plurality of planar and diffracting libraries in accordance with
an aspect of this invention;

Fig. 10a is a diagram of a conventional imaging system;

Fig. 10b is a diagram of a telecentric optical system that is a presently preferred embodiment of an imaging system
for use with this invention;

Fig. 11a is a block diagram of a thickness determination system in accordance with a first embodiment of this
invention; and

Fig. 11b illustrates a filter wheel for use in a second embodiment of this invention.

DETAILED DESCRIPTION OF THE INVENTION


It is first noted that the teaching of this invention is applicable in general to any structured optical pattern, whether
it be, by example, a patterned wafer, a liquid crystal display, or a biological sample. As such, and although the invention
is described herein in the context of a semiconductor wafer fabrication application, the teaching of this invention is not
to be construed to be limited in scope to only the determination of a film thickness upon a semiconductor wafer. The
terms "film" and "layer" are used interchangeably herein, and are both intended to encompass a region comprised of
a first material (e.g., SiO2) that is disposed upon or over a second material (e.g., Si), wherein the first material has at
least one optical characteristic (e.g., index of refraction) that is different than that of the second material.

Patterned wafers at first glance contain a bewildering number of features, patterns and sub-patterns having
dimensions that range in size from sub-micron to some hundreds of microns. However, most of the sub-patterns are
extremely repetitive and contain a limited number of line segment angles. The inventor has recognized that regions
on a wafer can be considered to fall into two broad categories.

A first category includes areas that contain film structures which consist of only planar layers. These regions
typically are the scribe alleys between dies and the areas between blocks of sub-micron features. Areas free of circuit
structures may also occur during each step of the die fabrication process. The measurement of this type of region can
be accomplished using multispectral reflectometry combined with a search of pre-computed libraries, as disclosed in
the above-mentioned commonly assigned U.S. Patents 5,291,269, 5,293,214 and 5,333,049, which are each
incorporated by reference herein in their entireties.

A second category of wafer regions contains circuit details, such as sub-micron integrated circuit structures, metal
traces, die lines, capacitors, and also local defects caused by processing. This second category of wafer regions is
more difficult to characterize because the combination of one or more thin film layers, that are coherently coupled to
such microscopic patterns, alters the phase and amplitude of the reflected light. In addition, these spatial variations
tend to scatter and diffract light in non-specular directions. Numerical libraries pre-computed for this type of region
require a knowledge of the material optical properties, as well as a knowledge of the mask patterns used at each stage
of manufacture.

As employed herein the term 'coherent coupling' has the following meaning. If two optical surfaces have intensity
transmissions of T1 and T2, then the combined transmission is simply T1 times T2. However, if the surfaces are
coherently coupled together, then the resultant transmission is no longer the simple product, but depends instead on the
amplitude and the phase of the transmission coefficient at each surface as well as on the distance between the two
surfaces.
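
The distinction drawn above can be illustrated numerically. The following is a minimal sketch only, and rests on assumptions that are not part of this description: two lossless surfaces at normal incidence, zero phase shift on reflection, and the textbook Airy (Fabry-Perot) expression for the coherent case. The function names are hypothetical.

```python
import cmath
import math


def incoherent_transmission(T1, T2):
    """Combined intensity transmission of two uncoupled surfaces: T1 times T2."""
    return T1 * T2


def coherent_transmission(T1, T2, n, d, wavelength):
    """Combined intensity transmission of two coherently coupled surfaces.

    Assumes lossless surfaces (R = 1 - T), normal incidence, and no phase
    shift on reflection. n is the index of the gap between the surfaces,
    d its thickness, in the same length unit as wavelength.
    """
    r1 = math.sqrt(1.0 - T1)  # amplitude reflection magnitude, surface 1
    r2 = math.sqrt(1.0 - T2)  # amplitude reflection magnitude, surface 2
    delta = 2.0 * math.pi * n * d / wavelength  # one-pass phase thickness
    denominator = 1.0 - r1 * r2 * cmath.exp(2j * delta)
    return (T1 * T2) / abs(denominator) ** 2


if __name__ == "__main__":
    # Same two surfaces, three different separations at 550 nm.
    for d in (137.5, 200.0, 275.0):
        print(d, incoherent_transmission(0.5, 0.5),
              coherent_transmission(0.5, 0.5, 1.0, d, 550.0))
```

In this sketch the incoherent product is independent of the separation d, whereas the coherent result swings between a low and a high value as d changes, which is the dependence on amplitude, phase and distance described in the text.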

A basic principle of this invention is illustrated in the optical system 10 shown in Fig. 1, where a surface 12 to be
measured is viewed at normal incidence using an optical system 14 having a large F/number. As a result, the only light
which can reach a multi-pixel CCD camera 16 is that which leaves the surface 12 within a few degrees of the optical
axis 14a of the optical system 14, that is, light having angles within an acceptance cone of the optical system 14. It is
noted that non-normal incidence can also be employed, providing that the image plane of the camera 16 is tilted and
the illumination sources are correctly positioned.

The optical system 14 is preferably a full aperture telecentric system having a narrow cone of acceptance angles.
Referring briefly to Fig. 10a there is illustrated a conventional imaging system used with a camera focal plane. A lens
is disposed above a surface to be imaged at the camera focal plane. In the conventional imaging system the central
rays (CR) are not perpendicular to the surface being imaged. As a result, different points in the field of view (FOV) of
the camera are not imaged in the same manner. The conventional imaging system is characterized by a working
distance (WD) measured in millimeters, a FOV (F/1) that is measured in tens of degrees, and a depth of field measured
in microns.

In contradistinction, and referring to Fig. 10b, a telecentric imaging system is characterized by two lenses (L1 and
L2), and an aperture (AP) that is located one focal length from L2. In the telecentric system the chief central rays (CR)
are all parallel and perpendicular to the surface being imaged. The telecentric imaging system is characterized by a
working distance (WD) measured in centimeters, a FOV of approximately one degree (i.e., a narrow cone of
acceptance angles), and a depth of field measured in millimeters.

Referring again to Fig. 1, a white (broadband) light source 18 and condensing lens 20 provide illumination of the
surface 12 under test. If the surface 12 under test is illuminated with light incident at an angle greater than half the
acceptance angle of the optical system 14, and if the surface 12 contains only planar layers, then none of the specularly
reflected light enters the optical system and the image intensity corresponds to the black level of the CCD camera 16.
That is, no image is detected. This specularly reflected light is indicated generally as 18a.

If, on the other hand, the surface 12 being viewed contains micron and sub-micron sized patterns of any type, then
some light (designated as 18b) will be deviated by diffraction and scattering into the narrow acceptance cone of the
optical system 14 and, hence, to the camera 16. The resulting detected pixel image thus includes bright areas
corresponding to scattering and diffracting features, such as edges, that are embedded in the patterned regions.
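
The dark-field behavior described in the two preceding paragraphs can be sketched with the scalar grating equation. This is an illustrative sketch only: it assumes an idealized one-dimensional periodic pattern of a given pitch, uses the sign convention in which the zero order leaves at the mirror angle, and the function names and numerical values are hypothetical rather than taken from this description.

```python
import math


def diffracted_angle_deg(incidence_deg, wavelength_um, pitch_um, order):
    """Exit angle, measured from the optical axis, of one diffraction order.

    Uses sin(theta_m) = sin(theta_i) + m * wavelength / pitch, so that the
    zero order (specular reflection) leaves at the incidence angle.
    Returns None when the order does not propagate.
    """
    s = math.sin(math.radians(incidence_deg)) + order * wavelength_um / pitch_um
    if abs(s) > 1.0:
        return None
    return math.degrees(math.asin(s))


def orders_reaching_camera(incidence_deg, wavelength_um, pitch_um,
                           half_cone_deg, max_order=5):
    """List the diffraction orders that fall inside the acceptance cone."""
    hits = []
    for m in range(-max_order, max_order + 1):
        theta = diffracted_angle_deg(incidence_deg, wavelength_um, pitch_um, m)
        if theta is not None and abs(theta) <= half_cone_deg:
            hits.append(m)
    return hits


if __name__ == "__main__":
    # Oblique illumination at 30 degrees, 0.5 degree half-cone (assumed values).
    print(orders_reaching_camera(30.0, 0.55, 1.1, 0.5))
```

With these assumed values the zero order (the specular beam, analogous to 18a) misses the cone, while one diffracted order (analogous to 18b) is redirected along the optical axis and would appear bright in the image.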

In practice the optical system 14 shown in Fig. 1 also includes an on-axis filtered light source 22 and a beamsplitter 
(not shown in Fig. 1) placed between the surface 12 under test and the camera 16. This permits the recording and 
digitization of multispectral images which are used to measure the optical spectra at each pixel location, as is described 
in detail below. 

Diffraction occurs when light illuminates a surface where the complex refractive index changes abruptly over the
surface. The amount of light diffracted in an optical system is approximately proportional to the total length of an
illuminated edge, multiplied by both the wavelength of the light and the intensity of the light beam (watts/cm). In
conventional optical systems this is typically an extremely small amount since only a few apertures are illuminated. The
opposite is true, however, in the case of patterned wafers, where the total length of the lines (corresponding to circuit
features) which can diffract light is enormous. The large number of edge features results in large amounts of incident
light being redirected in non-specular directions.

If images of the patterned wafer surface are recorded while being illuminated from at least one, and preferably two
or more, different directions using at least one and possibly a large range of incident angles, wavelengths and
polarizations, then a mask can be created from these digitized images. The mask is then used to differentiate those regions
of the wafer surface that are free of scattering and diffracting features from those regions that are not. In accordance
with this invention, the application of the mask identifies those regions where specular reflection can be used to detect
the thickness of the planar layer(s), as in the aforementioned U.S. Patent 5,333,049. In the simplest case the mask is
a two state (binary) mask, but in general multiple (three or more) level masks can be obtained depending upon the
optical effects to be masked.

In the case of the two state (i.e., binary) mask, thickness computations using spectral libraries precomputed for
planar layers are only valid in those regions of the mask where the diffracted light level is zero, i.e., the black regions.
Planar spectral libraries of a type employed in the above-mentioned U.S. Patents 5,291,269, 5,293,214 and 5,333,049,
which have been incorporated by reference herein in their entireties, are not in general valid in the bright regions since
the spectral signature of such regions is strongly influenced by the diffractive losses.

In consequence, a simple image mask, for example a binary mask, is used to reduce the number of points to be
computed in order to determine a thickness profile of at least one layer, such as a CMP layer 12a, that overlies a
patterned surface of a wafer 13. This results in increased processing speed and a reduction in errors in the resultant
thickness map(s), since the diffracting areas are automatically screened out and eliminated from the film thickness
determination.
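
One possible way to form a two state mask from the oblique-illumination images is sketched below. It is an assumption-laden illustration, not the procedure of this description: the fixed threshold, the convention that 1 marks a planar (usable) pixel and 0 a scattering or diffracting pixel, and the function name are all hypothetical.

```python
def build_planar_mask(dark_field_images, threshold):
    """Combine oblique-illumination images into a binary mask.

    dark_field_images: list of images, each a list of rows of pixel values.
    A pixel is marked 1 (planar, usable) only if it stays at or below the
    threshold in every image; otherwise it is marked 0 (rejected).
    """
    rows = len(dark_field_images[0])
    cols = len(dark_field_images[0][0])
    mask = []
    for y in range(rows):
        mask_row = []
        for x in range(cols):
            bright = any(image[y][x] > threshold for image in dark_field_images)
            mask_row.append(0 if bright else 1)
        mask.append(mask_row)
    return mask


if __name__ == "__main__":
    # Two small images taken with illumination from two directions.
    image_a = [[2, 3, 90, 2],
               [1, 2, 85, 3]]
    image_b = [[3, 2, 4, 2],
               [70, 2, 5, 1]]
    for row in build_planar_mask([image_a, image_b], threshold=10):
        print(row)
```

Using more than one illumination direction matters here because a line feature may diffract strongly toward the camera for one direction and only weakly for another; taking the logical combination of the images rejects the pixel in either case.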

Reference is now made to Fig. 11a for showing in greater detail one presently preferred embodiment of this
invention. A primary white light source (LS1) provides a beam to a condensing lens (L1) which focusses the beam onto
a filter wheel 24 having a plurality of different filters 24a each of which passes a different wavelength. Filtered light
emanating from the filter wheel 24 is collimated by a second lens (L2) and is directed to a 50/50 beamsplitter 26. Light
reflecting from the beamsplitter 26 is directed along the optical axis 14a that is normal to the surface of the wafer 13.
Light that is specularly reflected from the surface of the wafer 13 is collected by the telecentric optical system 14 and
is focussed at the image plane of the CCD camera 16, as was described previously with regard to Fig. 10b. The output
of the CCD camera 16 is provided to a conventional frame grabber 16a which stores, for each position of the filter
wheel 24, the resulting image or light signature. A complete frame comprised of pixel intensity values is read out by a
processor 28a which stores the frame in a memory 28b. The processor 28a controls the position of the filter wheel 24
so as to obtain N images of the wafer, wherein each image corresponds to illumination of the wafer with light of a
predetermined wavelength as selected by the particular filter that is interposed in the path of the white light beam from
source LS1. At least one predetermined spectral library is stored in memory 28c, which is subsequently accessed and
compared to the stored pixel values in the memory 28b.

The spectral library 28c contains a description of a set of reflectance curves for a range of parameter values such
as film thickness. For example, and referring to Fig. 4g, if there are two films or layers deposited onto a silicon wafer
the films will have optical properties (n_1, k_1, t_1) and (n_2, k_2, t_2), where t_1 and t_2 are the thicknesses of films 1 and 2,
respectively, n_1 and n_2 are the refractive indices of films 1 and 2, respectively, and k_1 and k_2 are the absorption constants
of films 1 and 2, respectively. The substrate also has a refractive index n_s and an absorption constant k_s. The index of
refraction and absorption values are wavelength dependent, and are actually given by n(λ), k(λ) for all films and the
substrate.

The reflectance (R) of light at wavelength λ for film thicknesses t_1 and t_2 for the two materials can be written as:
R(λ, t_1, t_2) = F(λ, n_s(λ), k_s(λ), t_1, n_1(λ), k_1(λ), t_2, n_2(λ), k_2(λ)).
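
The description does not give the form of the function F. As a hedged illustration only, the sketch below evaluates one standard choice, the normal-incidence characteristic-matrix method for a stack of planar films, which takes the same inputs as F. The layer ordering (listed from the ambient side down to the substrate), the complex index convention N = n - ik, the function name, and the index values used in the example are assumptions and not part of this description.

```python
import cmath
import math


def stack_reflectance(wavelength, layers, substrate_nk, ambient_n=1.0):
    """Normal-incidence reflectance of planar films on a substrate.

    wavelength:   same length unit as the layer thicknesses.
    layers:       list of (n, k, t) tuples, ambient side first; n and k are
                  the values at this wavelength.
    substrate_nk: (n_s, k_s) at this wavelength.
    """
    n_s, k_s = substrate_nk
    # Start from the substrate and apply each layer matrix, bottom layer first.
    b, c = 1.0 + 0j, complex(n_s, -k_s)
    for n, k, t in reversed(layers):
        index = complex(n, -k)
        delta = 2.0 * math.pi * index * t / wavelength  # phase thickness
        cos_d, sin_d = cmath.cos(delta), cmath.sin(delta)
        b, c = (cos_d * b + 1j * sin_d / index * c,
                1j * index * sin_d * b + cos_d * c)
    r = (ambient_n * b - c) / (ambient_n * b + c)  # amplitude reflectance
    return abs(r) ** 2


if __name__ == "__main__":
    # Placeholder, non-dispersive indices chosen only for illustration.
    film_1 = (1.46, 0.0, 500.0)   # (n_1, k_1, t_1), thickness in nm
    film_2 = (1.46, 0.0, 1330.0)  # (n_2, k_2, t_2), thickness in nm
    for wavelength in (450.0, 550.0, 650.0):
        print(wavelength, stack_reflectance(wavelength, [film_2, film_1], (3.9, 0.0)))
```

A library of the kind stored in memory 28c could then be populated by calling such a function at each wavelength for each candidate thickness, with n(λ) and k(λ) supplied from tabulated material data.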

If it is desired to measure the second film thickness t_2, then a series of reflectances R_1 ... R_m can be pre-computed
at wavelengths λ_1 ... λ_m for each value of the unknown thickness parameter t_2.

All of these spectra are sampled data sets containing m values. The library 28c therefore contains all possible
reflectance (or transmission) spectra (or some normalized version thereof) which are expected during a single
measurement.

When a wafer is measured there is obtained a set of reflectance values (P) for wavelengths λ_1, λ_2, ..., λ_m. For those
reflectance values at the same m wavelengths in the library 28c, the processor 28a determines which spectral pattern
in the library 28c most closely matches the measured reflectance values. The thickness associated with a selected
spectral pattern is thus correlated with the thickness of the film being measured.

In the simplest calculation a merit function M(t_2) is derived which is a least squares function formed from the
measured spectrum and one of the library spectra:

M(t_2) = (P_1 - R_1(t_2))^2 + (P_2 - R_2(t_2))^2 + ... + (P_m - R_m(t_2))^2

Clearly, if the measured reflectance values P_1 ... P_m were all exactly the same as the precomputed reflectances
R_1(t_2) ... R_m(t_2) for some unknown t_2, then M will be zero and a perfect match will have been found. In practice, noise in
the measurement system causes M to rarely equal zero. As a result, the goal is to determine a merit function M(t_2)
having a minimum value. This is accomplished by finding the sum in the foregoing equation for all values of t_2, and
then choosing the t_2 value which gives the minimum merit function M(t_2). The selected t_2 value is thus taken to be the
thickness of the layer 2 in Fig. 4g.
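
The search just described can be sketched as follows. The library contents below are invented numbers used only to exercise the code, and the function names are hypothetical; only the least squares merit function and the choice of the minimum follow the text.

```python
def merit(measured, reference):
    """Least squares merit function: sum over wavelengths of (P_i - R_i)^2."""
    return sum((p - r) ** 2 for p, r in zip(measured, reference))


def best_fit_thickness(measured, library):
    """Return (thickness, merit value) for the best matching library spectrum.

    library maps each candidate thickness t_2 to its pre-computed
    reflectances R_1(t_2) ... R_m(t_2), sampled at the same m wavelengths
    as the measured values P_1 ... P_m.
    """
    best_t = min(library, key=lambda t: merit(measured, library[t]))
    return best_t, merit(measured, library[best_t])


if __name__ == "__main__":
    # Toy library for m = 3 wavelengths (illustrative values only).
    library = {
        1.30: [0.30, 0.22, 0.15],
        1.33: [0.28, 0.25, 0.18],
        1.36: [0.26, 0.28, 0.21],
    }
    measured = [0.27, 0.26, 0.18]  # noisy measurement, so M is not zero
    print(best_fit_thickness(measured, library))
```

Because noise keeps the minimum of M above zero, the residual value returned alongside the thickness can also serve as a per-pixel quality indicator, in the manner of the merit function maps of Figs. 8a and 8b.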

Referring again to Fig. 11a, and in accordance with this invention, at least two additional light sources LS2 and 
LS3 are provided for illuminating the wafer at angles that are off of the optical axis (OA). The wafer images obtained 
with these two additional light sources are employed to detect those regions of the wafer surface that are subject to 
non-specular reflections, i.e., those regions having features that scatter and diffract the illumination. The scattering 
and diffracting images are also stored within the memory 28b, and are processed as described below to generate a 
binary mask defining those regions of the wafer wherein the data stored in the spectral library 28c may give incorrect 
thickness results. 

The two oblique light sources LS2 and LS3 may be replaced by a single source that surrounds the wafer 13.
Alternatively, and as is illustrated in Fig. 11b, the filter wheel 24 may have two positions 25a and 25b wherein a feature 25c,
such as a centrally located opaque region, causes the incident light to be diffracted away from the axis normal to the
surface of the filter wheel 24. This can be seen by contrasting the normal ray A with the diffracted ray B. The result is
that the ray B will strike the surface of the wafer 13 at an angle that diverges from the optical axis 14a, which is the
desired result. By providing two regions 25a and 25b that bend the light in two different directions, it is possible to
illuminate the wafer 13 from two different directions, as will be described below with respect to Figs. 4b, 4c and 4d.
Representative dimensions for the regions 25a and 25b are a diameter of one inch, and a width of 0.75 inch for the
centrally located opaque regions 25c.

Fig. 2 illustrates the computation flow for a conventional multi-spectral case. In Block A N images are acquired at 
N different wavelengths. In Block B a computation rectangle is defined from the N images. In Block C a spectral curve 
is obtained for each pixel within the computational rectangle. The spectral curve indicates the amount of light reaching 
the camera for each of the N wavelengths. At Block D a pre-calculated spectral library (Block E) is accessed to obtain 
a best fit film thickness. This is a recursive process, with control flowing back to Block C until a best fit curve is obtained 
for all pixels in the computational rectangle. At Block F a thickness (t) is output from the processor 28a for each (x,y) 
pixel position within the computational rectangle. 
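
The library search of Blocks C to F can be sketched as follows. This is a minimal illustration and not the actual implementation: the function names, the dictionary layout of the library, and the use of a sum-of-squares difference as the goodness of fit are assumptions made for the example.

```python
def best_fit_thickness(spectrum, library):
    """Return (thickness, merit) of the library entry closest to one pixel's spectrum.

    spectrum: list of N measured intensities, one per wavelength.
    library:  dict mapping a candidate thickness to its N pre-computed intensities
              (hypothetical layout chosen for this sketch).
    """
    best_t, best_merit = None, float("inf")
    for t, reference in library.items():
        # Sum-of-squares difference over the N wavelengths (assumed merit).
        merit = sum((m - r) ** 2 for m, r in zip(spectrum, reference))
        if merit < best_merit:
            best_t, best_merit = t, merit
    return best_t, best_merit


def thickness_map(spectra, library):
    """Apply the library search to every pixel of a computation rectangle.

    spectra: rows of pixels, each pixel being a list of N intensities.
    """
    return [[best_fit_thickness(px, library)[0] for px in row] for row in spectra]
```

Each pixel is compared against every library entry, and the thickness whose pre-computed spectrum differs least from the measured spectrum is reported for that (x,y) position.
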

Fig. 3 illustrates the computation flow in accordance with this invention, wherein only pixels in mask-specified 
planar layer regions are processed. In Block A N images are acquired at N different wavelengths. In Block B at least 
two scatter/diffraction images are acquired. In Block C the processor 28a generates a binary mask based on the 
scatter/diffraction images that are acquired in Block B. In Block D a computation rectangle is defined based on the mask 
obtained in Block C. In Block E a determination is made, for each pixel in the computation rectangle, whether the merit 
function (M) is equal to zero or one. If M=0 the pixel is rejected, while if M=1 the pixel is selected and the method 
continues to Block F where a spectral curve is obtained for each selected pixel within the computational rectangle. As 
before, the spectral curve indicates the amount of light reaching the camera for each of the N wavelengths. At Block 
G a pre-calculated spectral library (Block I) is accessed to obtain a best fit film thickness. At Block F a thickness (t) is 
output for each (x,y) selected pixel position within the computational rectangle. In accordance with this invention the 
selected pixels are those that correspond to a planar layer region of the surface under test that is free of underlying 
scattering and diffracting features. 
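
The pixel selection of Block E can be sketched as below. The function name and the `fit` callable are hypothetical; `fit` stands in for whatever per-pixel library search is used, and rejected pixels are marked with `None` purely as a convention of this example.

```python
def masked_thickness_map(spectra, mask, fit):
    """Compute a thickness only where the binary mask value M equals 1.

    spectra: rows of pixels, each pixel being a list of N intensities.
    mask:    rows of 0/1 values with the same shape as spectra.
    fit:     callable mapping one pixel spectrum to a thickness (hypothetical).
    Rejected pixels (M = 0) are reported as None and are never passed to fit.
    """
    result = []
    for spec_row, mask_row in zip(spectra, mask):
        result.append([fit(px) if m == 1 else None
                       for px, m in zip(spec_row, mask_row)])
    return result
```

Because rejected pixels are skipped before the library search, no fitting effort is spent on regions that contain scattering and diffracting features.
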

Figs. 4a-4e illustrate the principle of the mask generation method for a patterned wafer which has been coated 
with at least one layer of silicon dioxide (SiO2) as part of a planarization process. The images shown in Figs. 4a-4e 
are approximately 1/8th of the total image collected by the frame grabber 16a that is associated with the CCD camera 
16 of Fig. 1. 

It can be seen that the substantially monochromatic image (one of many taken during a measurement) shown in 
Fig. 4a does not provide any indication concerning which areas of the wafer contain scattering/diffracting regions and 
which areas contain only planar (specularly reflecting) layers. The images shown in Figs. 4b and 4c are scatter images 
taken using the off-axis light source 18 at two different illumination directions (front and side). Essentially these two 
images are "dark field" images which are combined, pixel-by-pixel, to form a composite scattering and diffracting image 
that is shown in Fig. 4d. The two scatter images in this case were taken with the light sources 18 in approximately 
orthogonal directions. This is evident from the different orientations of the line pair images on the left side of the images 
in Figs. 4b and 4c. 

Referring to Fig. 4g, these line pair images correspond to the horizontal and vertical edges of square apertures 
that were etched into a SiO2 CMP layer before the entire wafer was coated with a second layer of SiO2. Off-axis 
illumination that strikes the edges is scattered and diffracted into the acceptance cone of the optical system 14, as indicated 
by rays C and D, and is detected by the camera 16. Those rays that strike only planar film regions, without underlying 
substrate circuit features, are specularly reflected (rays A and B), and do not enter the acceptance cone of the optical 
system 14. That is, for the specularly reflected rays A and B the angle of reflection is approximately equal to the angle 
of incidence, and the angle of incidence is predetermined so that the reflected rays do not enter the acceptance cone 
of the optical system 14. By example, if the acceptance cone of the imaging system 14 includes only rays that diverge 
by up to 1° from the optical axis, then an angle of incidence that is 2° from the optical axis (or 88° with respect to the 
surface of the wafer) is sufficient to ensure that any specularly reflected rays will not reach the focal plane of the CCD 
camera 16. 
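
The geometric condition in the example above can be checked with a short sketch. The function names are hypothetical, and the sketch assumes an ideal planar surface whose normal coincides with the optical axis, so that the specular ray leaves at the same angle from the axis as the incident ray.

```python
def specular_ray_is_collected(incidence_deg, acceptance_half_angle_deg):
    """True if a specularly reflected ray falls inside the acceptance cone.

    Both angles are measured from the optical axis, which is assumed to be
    normal to the wafer surface; the angle of reflection then equals the
    angle of incidence.
    """
    reflection_deg = abs(incidence_deg)
    return reflection_deg <= acceptance_half_angle_deg


def angle_from_surface_deg(incidence_deg):
    """Convert an angle measured from the optical axis to one measured from the wafer surface."""
    return 90.0 - abs(incidence_deg)
```

With a 1° acceptance half-angle, on-axis illumination (0°) is collected, while illumination at 2° from the axis (88° from the wafer surface) is not, so only scattered or diffracted light from that source can reach the camera.
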

As a result, the edges associated with rays C and D contribute to the formation of 'bright' areas in the CCD camera 
image, while the planar regions associated with rays A and B do not contribute to the image and are thus "dark". 

The line pairs of Figs. 4b and 4c combine to outline the inverted islands in the composite image of Fig. 4d, which 
also indicates, by the presence of the dark central region, that there are no circuit details inside these squares. 

In general, this technique can employ a single source that is moved relative to the wafer, or can use multiple 
sources disposed at different locations, or can employ a ring or annular source that simultaneously applies off-axis 
illumination from all directions relative to the wafer. The source or sources may provide various wavelengths, polarizations 
and incidence angles to enhance the light signature recording process, and may be used to code various image 
regions on the wafer surface. By example, and referring to Fig. 1, a polarizer plate 19 can be interposed within the 
beam of the off-axis light source 18. By rotating the polarizer plate 19 various polarization states can be introduced 
into the off-axis beam. 

As is seen in the detail of Fig. 4f, the camera field of view also contains a planar silicon reflecting surface 30 which 
provides a "black level" planar reference surface. This silicon structure may be thought of as a 'picture frame' that 
surrounds the image of the silicon wafer. The off-axis illumination that strikes the reference surface (RS) experiences 
specular reflection (ray A), just as does the ray B that strikes an unpatterned region of the silicon wafer, and is thus 
not detected by the camera 16. In contradistinction, the off-axis illumination that strikes a patterned region of the wafer 
experiences scattering and diffraction (rays C), and a portion of the scattered and diffracted illumination is detectable 
by the camera 16. The Si reference surface RS can be seen in the left-most region in all the images shown in Figs. 4a-4e. 

In accordance with an aspect of this invention the electronic intensity level in the area corresponding to the reference 
surface RS is used to determine a level at which to threshold the images of Figs. 4b and 4c, i.e., any pixels above 
the silicon reference level are set to 0.0 (black) and those below the silicon reference level are set to 1.0 (white). This 
thresholding of the image generates the binary mask 32 that is illustrated in Fig. 4e. In Fig. 4e those areas that appear 
black contain scattering and diffracting features, while those areas that appear white are free of such features, and are 
associated only with planar film systems wherein accurate film thickness determinations can be made using pre-computed 
library functions. In some cases the edges of the resulting mask may be irregular. However, the irregularities 
can be removed by the use of known image processing techniques, such as filters, to remove isolated pixels. 
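
The mask generation described above can be sketched as follows. Several details are assumptions made for illustration only: the two scatter images are combined by taking the pixel-wise maximum, the reference level is taken as the mean intensity over the image columns that show the reference surface RS, a small `margin` is added to that level to tolerate noise, and isolated pixels are removed with a simple 4-neighbour rule. All function and parameter names are hypothetical.

```python
def reference_level(image, ref_columns):
    """Mean intensity over the columns assumed to image the reference surface."""
    values = [row[c] for row in image for c in ref_columns]
    return sum(values) / len(values)


def binary_mask(scatter_a, scatter_b, ref_columns, margin=0.0):
    """Threshold a composite scatter image against the reference level.

    Pixels brighter than the level become 0 (scattering/diffracting);
    all others become 1 (planar).
    """
    # Composite image: pixel-wise maximum of the two scatter images (assumed).
    composite = [[max(a, b) for a, b in zip(row_a, row_b)]
                 for row_a, row_b in zip(scatter_a, scatter_b)]
    level = reference_level(composite, ref_columns) + margin
    return [[0 if v > level else 1 for v in row] for row in composite]


def remove_isolated(mask):
    """Flip any pixel whose in-bounds 4-connected neighbours all differ from it."""
    h, w = len(mask), len(mask[0])
    cleaned = [row[:] for row in mask]
    for y in range(h):
        for x in range(w):
            nbrs = [mask[j][i]
                    for j, i in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1))
                    if 0 <= j < h and 0 <= i < w]
            if nbrs and all(n != mask[y][x] for n in nbrs):
                cleaned[y][x] = nbrs[0]
    return cleaned
```

The isolated-pixel filter here is only one example of the known image processing techniques mentioned above; other filters could equally be used to smooth irregular mask edges.
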

The collection of the images of Figs. 4b and 4c is preferably made during the acquisition of all the multi-spectral 
images, but before any thickness calculations are performed. 

It should be noted that the technique described thus far does not require any special wafer positioning or rotational 
alignment, other than that required to determine where the map is being measured for archival purposes. 

An examination of the various regions of the wafer image shown in Fig. 4a with a high power microscope capable 
of resolving submicron lines revealed that all of the areas which show scatter and diffraction contain circuit features. 
Most of the rectangular regions contain vertical line patterns. The empty rectangles on the left side (see Fig. 4g) do 
not contain features, and so with the exception of the surrounding edge regions, these squares contain only planar 
films as indicated by the corresponding bright portions of the mask of Fig. 4e. 

Thickness maps were generated for the entire 600x64 pixel image of Fig. 4a using a numerical spectral library 
28c which contained spectra of a single SiO2 layer (0 to 4.0 microns) on silicon, since it was known that the wafer had 
been coated with such a layer. Figs. 5a and 5b show a comparison of the thickness maps (32x300 points) obtained 
both without (Fig. 5a) and with (Fig. 5b) the use of the mask of Fig. 4e. In this case thicknesses were determined for 
each of the 38,400 pixels and the mask was then used as a multiplier. It can be seen that the unmasked map of Fig. 
5a contains numerous incorrect values, some of which are due to the edge of the silicon reference surface RS, but 
most of which are due to scattering and diffraction effects from the patterned wafer surface. The masked map of Fig. 
5b contains far fewer points, since the area under consideration includes large areas containing circuit details which 
were excluded from the thickness determination. 

However, the thickness values of Fig. 5b can be shown to fall into two main bands, as is indicated when contrasting 
the histogram of Fig. 6a with that of Fig. 6b. In the unmasked case of Fig. 6a, 5% to 6% of the pixels give an incorrect 
answer, whereas the use of the mask 32 of Fig. 4e reduces the number of incorrect pixel values to less than 0.1%. 
The width of the histogram peaks also indicates the degree of film thickness uniformity over the field of view that was 
used (5mm x 0.5mm). 

It is apparent from the histogram data that two film thickness values are predominant, but these values are difficult 
to discern from a three dimensional plot. Figs. 7a and 7b show these two major components of the masked thickness 
map obtained by thresholding the data into two ranges (0.0 to 1.9 microns) and (1.9 to 2.5 microns), and further illustrate 
the clarity that is introduced when the masking technique is used. In Fig. 7a the rectangularly shaped regions on the 
left correspond to the top of the apertures shown in Fig. 4g, while in Fig. 7b the rectangularly shaped regions on the 
left correspond to the film layer within the bottom of each of the apertures in Fig. 4g. 
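
Separating the masked thickness map into the two ranges can be sketched as below. The function name is hypothetical, the range is treated as half-open (low inclusive, high exclusive) as an assumption so that a value of exactly 1.9 microns falls in only one band, and excluded pixels are marked with `None` for the purposes of this example.

```python
def split_band(thickness, mask, low, high):
    """Keep thickness values t with low <= t < high at pixels where the mask is 1.

    thickness: rows of per-pixel thickness values in microns.
    mask:      rows of 0/1 values with the same shape as thickness.
    Pixels outside the band, or rejected by the mask, become None.
    """
    return [[t if m == 1 and low <= t < high else None
             for t, m in zip(t_row, m_row)]
            for t_row, m_row in zip(thickness, mask)]
```

The mask is consulted directly here rather than being applied as a multiplier first, since a masked-out pixel multiplied to 0.0 would otherwise fall inside the lower range.
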

Although the masking technique does effectively catalog the patterned wafer surface into planar and diffracting 
regions, it is still possible for there to be several different thin film stacks present within a given area of a wafer. If it is 
assumed that the two different film thicknesses indicated by the measurement described earlier were actually caused 
by depositing a film of SiO2 onto two structures, in one case bare silicon and in the other case a titanium nitride film on 
bare silicon, then the thickness determining algorithm would have computed thickness values for both sets of pixels, 
one set correct, the other incorrect. Fortunately, in cases like this the merit function, which describes the goodness of 
the fit, and which is automatically calculated during the thickness determination, is very useful in discriminating against 
different film stack designs. That is, different film systems over different regions exhibit merit functions having different 
average values over the different thin film systems. The correct thicknesses correspond to the lowest average merit 
function since a "perfect fit" would give zero merit function values. 

The merit function maps corresponding to the measurement previously described are shown in Figs. 8a and 8b, 
where the vertical scale is in percent representing the least squares difference in reflectance between the measured 
and pre-computed spectra over all the wavelengths used in the measurement. Fig. 8a shows the unmasked merit map 
with values ranging up to 1%, and clearly shows the different average merit function values at different places on the 
wafer. In contrast, Fig. 8b shows the merit function only over the regions allowed by the mask, and clearly shows that 
the average values for the merit function are approximately the same over regions where the computed film thicknesses 
fall into two bands. This is a strong indication that the film materials and film stacks are identical, except for thickness 
variations in the two regions. 
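
The comparison of average merit function values between regions can be sketched as follows. The function names, the region labels, and the `tolerance` used to decide that two averages are "approximately the same" are all assumptions made for this example; no particular tolerance is specified here.

```python
def mean_merit_by_region(merit, labels):
    """Average merit-function value for each region label.

    merit:  rows of per-pixel merit values.
    labels: rows of region labels with the same shape; None marks pixels
            excluded by the mask, which are skipped.
    """
    totals, counts = {}, {}
    for merit_row, label_row in zip(merit, labels):
        for m, label in zip(merit_row, label_row):
            if label is None:
                continue
            totals[label] = totals.get(label, 0.0) + m
            counts[label] = counts.get(label, 0) + 1
    return {label: totals[label] / counts[label] for label in totals}


def same_stack(means, tolerance):
    """True if all regional averages lie within the given (assumed) tolerance."""
    values = list(means.values())
    return max(values) - min(values) <= tolerance
```

Regions whose average merit values agree are consistent with a single film stack that differs only in thickness, while a markedly higher average in one region suggests that the library in use does not describe the stack present there.
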

Further in accordance with this invention a method for creating a catalog for the different regions expected to be 
found on a patterned wafer or, by example, an LCD flat screen display, is illustrated in Fig. 9. A main division between 
planar and scattering/diffracting regions is augmented by using goodness of fit maps to distinguish between different 
planar stacks or different pattern/film stack combinations. It should be noted that the two libraries (planar and diffracting) 
are not independent, and that the measurement data obtained from planar layer regions can also be used to comple- 
ment analysis of the diffracting regions. The predominant errors in precomputing the numerical libraries for planar 
regions are caused by uncertainties in the optical constants for the different layers. Except for the well known list of 
certain materials, namely silicon, steam grown SiO2, air and water, most materials used in semiconductor manufacture 
exhibit optical properties which depend upon the deposition process used. 

Computing the numerical libraries for the various diffracting regions found on a wafer requires that the coherent 
coupling interactions between the patterned layers be included. This in turn requires a priori knowledge of the component 
shapes ("shapelets") of the pattern, since these tend to be repetitive and can be treated as arrays of shapelets 
in the calculations. 

In Fig. 9, having established the mask for a given wafer or portion of a wafer as described above (Block A), those 
regions designated as planar regions are processed with the established planar libraries for the various types of film 
stacks, while those regions designated as diffracting (i.e., regions having circuit features) are processed with the established 
diffracting libraries for discrete patterned wafer regions. In this manner a thickness profile that incorporates 
both planar and patterned wafer regions is obtained. 

The thickness measurements described herein are for what could be termed an uncooperative wafer, i.e., no prior 
knowledge was available as to the circuit patterns other than the fact that the entire surface had been coated with a 
planarizing layer of SiO2. In practice prior knowledge of the various film structures and possibly the local geometric 
patterns of circuit structures can be accurately determined prior to making the thickness determination. 

Although described in the context of presently preferred embodiments of this invention, it should be understood 
that a number of modifications can be made to these embodiments, and that these modifications will fall within the 
scope of the teaching of the invention. By example, if a GaAs wafer is being imaged it may be preferable to also employ 
GaAs as the reference surface (RS) material for the pixel thresholding operation. Also by example, the filter wheel 24 
can be replaced with a moveable grating or a prism for providing illumination with multiple wavelengths. 

Thus, while the invention has been particularly shown and described with respect to preferred embodiments there- 
of, it will be understood by those skilled in the art that changes in form and details may be made therein without departing 
from the scope and spirit of the invention. 

Claims 

1. An apparatus for detecting a thickness (t) of at least one layer (12a) disposed over a surface of a wafer (13) having 
one or more first regions provided with circuit and other features, and one or more second regions showing an 
absence of circuit and other features, comprising: 

an optical system (14) for collecting light reflecting from said at least one layer (12a) and said surface of said 
wafer (13), said optical system (14) having an optical axis (OA; 14a) and a cone of acceptance angles disposed 
about said optical axis (OA; 14a); 

a camera (16) coupled to said optical system (14) for obtaining an image from the collected light; 
a first light source (22; LS1) for illuminating said layer (12a) with light; and 
a data processor (28a) having an input coupled to an output of said camera (16), 
characterized by 


said first light source (22; LS1) emitting light that is directed along said optical axis (OA; 14a) and within said 
cone of acceptance angles; 

at least one second light source (18; LS2, LS3) for illuminating said layer (12a) with light that is directed off 
said optical axis (OA; 14a) and outside of said cone of acceptance angles; and 

said data processor (28a) obtaining from said camera (16) first pixel data corresponding to at least one first 
image obtained with light from said first light source (22; LS1), and obtaining second pixel data corresponding 
to at least one second image obtained with light from said at least one second light source (18; LS2, LS3), 
said data processor (28a) including means (28b, 28c) for generating an image mask from said second pixel 
data for distinguishing said first wafer regions from said second wafer regions, and further including means 
for detecting a thickness (t) of said at least one layer (12a) within said second regions in accordance with said 
first pixel data and in accordance with predetermined reference pixel data. 

2. The apparatus of claim 1, characterized in that said optical system (14) is comprised of a telecentric optical system. 

3. The apparatus of claim 1 or 2, characterized in that said first light source (22; LS1) provides illumination with a 
first incidence angle on said layer (12a), the first incidence angle being an angle that causes specularly reflected 
light to enter said cone of acceptance angles of said optical system (14), and that said second light source (18; 
LS2, LS3) provides illumination with a second incidence angle on said layer (12a), the second incidence angle 
being an angle that causes specularly reflected light to not enter said cone of acceptance angles of said optical 
system (14). 

4. The apparatus of any of claims 1 - 3, characterized in that said first light source (22; LS1) includes means (24) for 
sequentially illuminating said layer (12a) with light having different predetermined wavelengths, that said camera 
(16) obtains a plurality of first images, individual ones of the plurality of first images being obtained with light having 
one of the predetermined wavelengths, that said data processor (28a) includes means (28c), responsive to each 
of said plurality of first images, for comparing associated first pixel data values corresponding to one or more of 
said second regions with individual ones of a plurality of sets of predetermined image pixel values, each of the 
plurality of sets corresponding to a different layer thickness (t), and that said data processor (28a) further includes 
means for selecting as a layer thickness value (t) a thickness associated with a set that gives a best match with 
the first pixel data values. 

5. The apparatus of any of claims 1 - 4, characterized in that said at least one second light source is comprised of 
first and second second light sources (LS2, LS3) that are disposed for illuminating said layer (12a) from different 
directions. 

6. The apparatus of any of claims 1 - 5, characterized by means (19) for varying a polarization state of said second 
light source (18). 

7. The apparatus of claim 1, characterized by a reference surface (RS; 30) that is disposed within a field of view of 
said camera (16), said reference surface (RS; 30) being oriented with respect to said surface of said wafer (13) 
for specularly reflecting light from said second light source (18; LS2, LS3), and wherein said image mask generating 
means is responsive to second pixel data corresponding to an image of said reference surface (RS; 30) for thresh- 
olding said second pixel data into first image mask regions corresponding to said first wafer regions and second 
image mask regions corresponding to said second wafer regions. 

8. A method for determining a thickness (t) of a layer (12a) disposed over a surface of a substrate (13) comprising 
the steps of: 

illuminating the layer (12a) with light having a first incidence angle; 

obtaining at least one first image of the layer (12a) with light reflecting from the layer (12a) and the substrate 
(13); 

illuminating the layer (12a) with light having a second incidence angle; 

obtaining at least one second image of the layer (12a) with light reflecting from the layer (12a) and the substrate 
(13), the reflected light being primarily due to a presence of features within the layer (12a) or substrate (13) 
that scatters and/or diffracts the light having the second incidence angle; 

determining from the at least one second image a location of one or more regions of the layer (12a) or substrate 
(13) having the features and a location of one or more regions of the layer (12a) or substrate (13) not having 
the features; and 

determining, in accordance with the at least one first image, a thickness (t) of the layer (12a) only within the 
one or more regions not having the features. 

9. The method of claim 8, characterized in that the first incidence angle is an angle that causes specularly reflected 
light to enter an acceptance cone of an optical system (14) used to obtain the first image, and that the second 
incidence angle is an angle that causes specularly reflected light to not enter the acceptance cone of the optical 
system (14) used to obtain the second image. 

10. The method of claim 8, characterized in that 

the step of illuminating the layer (12a) with light having a first incidence angle includes the sub-step of sequentially 
illuminating the layer (12a) with light having different predetermined wavelengths; 

that the step of obtaining at least one first image of the layer includes the sub-step of obtaining a plurality of 
first images, individual ones of the plurality of first images being obtained with light having one of the predetermined 
wavelengths; and 

that the step of determining includes the substeps of: 

for each of the plurality of first images, comparing image pixel values corresponding to the one or more 
regions not having the features with individual ones of a plurality of sets of predetermined image pixel 
values, each of the plurality of sets corresponding to a different layer thickness; and 

selecting as a layer thickness value (t) a thickness associated with a set that gives a best match with the 
image pixel values. 
